Empirical Sufficiency Lower Bounds for Language Modeling with Locally-Bootstrapped Semantic Structures
In this work we build upon negative results from an attempt at language
modeling with predicted semantic structure, in order to establish empirical
lower bounds on what could have made the attempt successful. More specifically,
we design a concise binary vector representation of semantic structure at the
lexical level and evaluate in-depth how good an incremental tagger needs to be
in order to achieve better-than-baseline performance with an end-to-end
semantic-bootstrapping language model. We envision such a system as consisting
of a (pretrained) sequential-neural component and a hierarchical-symbolic
component working together to generate text with low surprisal and high
linguistic interpretability. We find that (a) dimensionality of the semantic
vector representation can be dramatically reduced without losing its main
advantages and (b) lower bounds on prediction quality cannot be established via
a single score alone, but need to take the distributions of signal and noise
into account.
Comment: To appear at *SEM 2023, Toronto.
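As a rough illustration of the sufficiency-bound question posed above, the sketch below simulates an incremental tagger of varying accuracy by flipping bits of gold binary semantic vectors and records where a placeholder downstream score crosses a baseline. The scoring function, dimensionality, and baseline value are assumptions made for the example, not the paper's setup, which, as noted, also requires looking at the distributions of signal and noise rather than a single score.

```python
# Hypothetical sketch: sweep the accuracy of a simulated incremental tagger
# by corrupting gold binary semantic vectors, and record the lowest accuracy
# at which a stand-in downstream score still beats a semantics-free baseline.
import numpy as np

rng = np.random.default_rng(0)

def corrupt(gold_vectors: np.ndarray, accuracy: float) -> np.ndarray:
    """Flip each bit independently with probability (1 - accuracy)."""
    flips = rng.random(gold_vectors.shape) > accuracy
    return np.where(flips, 1 - gold_vectors, gold_vectors)

def downstream_score(vectors: np.ndarray, gold: np.ndarray) -> float:
    """Stand-in for the language-model evaluation: mean bitwise agreement."""
    return float((vectors == gold).mean())

gold = rng.integers(0, 2, size=(1000, 32))   # 32-dim binary semantic vectors (toy)
baseline = 0.85                              # assumed score of the semantics-free baseline

for accuracy in np.linspace(0.5, 1.0, 11):
    score = downstream_score(corrupt(gold, accuracy), gold)
    status = "beats baseline" if score > baseline else "below baseline"
    print(f"tagger accuracy {accuracy:.2f} -> score {score:.3f} ({status})")
```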
Is Structure Necessary for Modeling Argument Expectations in Distributional Semantics?
Despite the number of NLP studies dedicated to thematic fit estimation,
little attention has been paid to the related task of composing and updating
verb argument expectations. The few exceptions have mostly modeled this
phenomenon with structured distributional models, implicitly assuming a
similarly structured representation of events. Recent experimental evidence,
however, suggests that the human processing system could also exploit an
unstructured "bag-of-arguments" type of event representation to predict
upcoming input. In this paper, we re-implement a traditional structured model
and adapt it to compare the different hypotheses concerning the degree of
structure in our event knowledge, evaluating their relative performance in the
task of argument expectation update.
Comment: conference paper, IWCS
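A toy sketch of the contrast under comparison, assuming simple additive vectors: a structured model scores a candidate filler against a role-specific expectation, while a "bag-of-arguments" model pools all previously seen arguments regardless of role. The vectors and similarity measure are placeholders, not the paper's re-implemented model.

```python
# Illustrative sketch (not the paper's implementation): structured vs.
# unstructured "bag-of-arguments" expectation update, with a candidate
# filler scored by cosine similarity against each expectation.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)
dim = 50
# Toy distributional vectors for arguments already seen in the sentence.
seen = {("eat", "nsubj"): rng.random(dim), ("eat", "dobj"): rng.random(dim)}
candidate = rng.random(dim)  # vector of an upcoming argument

# Structured model: expectation for the dobj role only.
structured_expectation = seen[("eat", "dobj")]

# Bag-of-arguments model: expectation pooled over all arguments, roles ignored.
bag_expectation = np.mean(list(seen.values()), axis=0)

print("structured fit:", cosine(candidate, structured_expectation))
print("bag-of-arguments fit:", cosine(candidate, bag_expectation))
```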
Measuring Thematic Fit with Distributional Feature Overlap
In this paper, we introduce a new distributional method for modeling
predicate-argument thematic fit judgments. We use a syntax-based DSM to build a
prototypical representation of verb-specific roles: for every verb, we extract
the most salient second order contexts for each of its roles (i.e. the most
salient dimensions of typical role fillers), and then we compute thematic fit
as a weighted overlap between the top features of candidate fillers and role
prototypes. Our experiments show that our method consistently outperforms a
baseline re-implementing a state-of-the-art system, and achieves results that
are better than or comparable to those reported in the literature for other
unsupervised systems. Moreover, it provides an explicit representation of the
features characterizing verb-specific semantic roles.
Comment: 9 pages, 2 figures, 5 tables, EMNLP 2017. Keywords: thematic fit,
selectional preference, semantic roles, DSMs (Distributional Semantic Models),
Vector Space Models (VSMs), cosine, APSyn, similarity, prototypes
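The weighted-overlap idea can be illustrated with a minimal sketch: the role prototype and the candidate filler are treated as sparse feature vectors, and thematic fit is scored on the overlap of their top-k features. The salience weights and the exact normalization below are placeholder choices, not the paper's formulation.

```python
# Minimal sketch of weighted feature overlap between a verb-specific role
# prototype and a candidate filler, both given as feature -> salience maps.
def top_k(features: dict, k: int) -> dict:
    return dict(sorted(features.items(), key=lambda kv: kv[1], reverse=True)[:k])

def weighted_overlap(prototype: dict, candidate: dict, k: int = 3) -> float:
    p, c = top_k(prototype, k), top_k(candidate, k)
    shared = set(p) & set(c)
    if not shared:
        return 0.0
    # Sum the smaller of the two weights for each shared feature, normalized
    # by the total weight of the prototype's top features (placeholder formula).
    return sum(min(p[f], c[f]) for f in shared) / sum(p.values())

# Toy salience scores for the dobj role of "eat" and a candidate filler "apple".
role_prototype = {"fruit": 0.9, "food": 0.8, "sweet": 0.5, "red": 0.2}
candidate_filler = {"fruit": 0.7, "red": 0.6, "round": 0.4, "food": 0.3}
print(weighted_overlap(role_prototype, candidate_filler))
```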
Logical Metonymy in a Distributional Model of Sentence Comprehension
In theoretical linguistics, logical metonymy is defined as the combination of an event-subcategorizing verb with an entity-denoting direct object (e.g., The author began the book), so that the interpretation of the VP requires the retrieval of a covert event (e.g., writing). Psycholinguistic studies have revealed extra processing costs for logical metonymy, a phenomenon generally explained with the introduction of new semantic structure. In this paper, we present a general distributional model for sentence comprehension inspired by the Memory, Unification and Control model by Hagoort (2013, 2016). We show that our distributional framework can account for the extra processing costs of logical metonymy and can identify the covert event in a classification task.
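A hedged sketch of the covert-event retrieval step, using random stand-ins for distributional vectors and plain vector addition as the composition function (the paper's MUC-inspired model is richer than this):

```python
# Toy covert-event retrieval for logical metonymy: compose the vectors of the
# overt words ("author", "begin", "book") and rank candidate covert events by
# cosine similarity to the composed representation.
import numpy as np

rng = np.random.default_rng(2)
dim = 50
vectors = {w: rng.random(dim) for w in ["author", "begin", "book", "write", "read", "burn"]}

composed = vectors["author"] + vectors["begin"] + vectors["book"]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

candidates = ["write", "read", "burn"]
ranking = sorted(candidates, key=lambda v: cosine(composed, vectors[v]), reverse=True)
print("ranked covert events:", ranking)
```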
Compositionality as an Analogical Process: Introducing ANNE
Usage-based constructionist approaches consider language a structured inventory of constructions, form-meaning pairings of different schematicity and complexity, and claim that the more a linguistic pattern is encountered, the more it becomes accessible to speakers. However, when an expression is unavailable, what processes underlie its interpretation? While traditional answers rely on the principle of compositionality, according to which the meaning is built word by word and incrementally, usage-based theories argue that novel utterances are created on the basis of previously experienced ones through analogy, mapping an existing structural pattern onto a novel instance. Starting from this theoretical perspective, we propose here a computational implementation of these assumptions. As the principle of compositionality has been used to generate distributional representations of phrases, we propose a neural network simulating the construction of phrasal embeddings as an analogical process. Our framework, inspired by word2vec and computer vision techniques, was evaluated on tasks of generalization from existing vectors.
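As a purely illustrative sketch of learning phrase embeddings from word pairs, a small network can be trained to map two word vectors onto an observed phrase vector; the architecture and loss below are assumptions made for the example, not ANNE's actual design.

```python
# Illustrative only: a tiny network that learns to map (word1, word2) vector
# pairs to phrase vectors from observed training triples.
import torch
import torch.nn as nn

dim = 50
model = nn.Sequential(nn.Linear(2 * dim, 128), nn.ReLU(), nn.Linear(128, dim))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Random stand-ins for distributional vectors of word pairs and the
# corpus-observed vector of the whole phrase.
w1, w2 = torch.randn(256, dim), torch.randn(256, dim)
phrase = torch.randn(256, dim)

for _ in range(100):
    optimizer.zero_grad()
    pred = model(torch.cat([w1, w2], dim=-1))
    loss = loss_fn(pred, phrase)
    loss.backward()
    optimizer.step()
print("final reconstruction loss:", loss.item())
```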
Extensive Evaluation of Transformer-based Architectures for Adverse Drug Events Extraction
Adverse Event (ADE) extraction is one of the core tasks in digital
pharmacovigilance, especially when applied to informal texts. This task has
been addressed by the Natural Language Processing community using large
pre-trained language models, such as BERT. Despite the great number of
Transformer-based architectures used in the literature, it is unclear which of
them performs better and why. Therefore, in this paper we perform an
extensive evaluation and analysis of 19 Transformer-based models for ADE
extraction on informal texts. We compare the performance of all the considered
models on two datasets with increasing levels of informality (forum posts and
tweets). We also combine the purely Transformer-based models with two
commonly-used additional processing layers (CRF and LSTM), and analyze their
effect on the models' performance. Furthermore, we use a well-established
feature importance technique (SHAP) to correlate the performance of the models
with a set of features that describe them: model category (AutoEncoding,
AutoRegressive, Text-to-Text), pretraining domain, training from scratch, and
model size in number of parameters. At the end of our analyses, we identify a
list of take-home messages that can be derived from the experimental data.
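One of the evaluated configurations, a pretrained Transformer encoder with a BiLSTM layer on top for token-level ADE tagging, can be sketched as follows; the model name, label set, and layer sizes are illustrative choices rather than the paper's exact setup.

```python
# Sketch of a Transformer encoder + BiLSTM head for token-level ADE tagging.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class TransformerBiLSTMTagger(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_labels=3, lstm_dim=128):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.lstm = nn.LSTM(hidden, lstm_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_dim, num_labels)  # e.g. O / B-ADE / I-ADE

    def forward(self, input_ids, attention_mask):
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.lstm(states)
        return self.classifier(lstm_out)  # per-token label logits

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
batch = tokenizer(["the drug gave me a terrible headache"], return_tensors="pt")
model = TransformerBiLSTMTagger()
logits = model(batch["input_ids"], batch["attention_mask"])
print(logits.shape)  # (batch, sequence_length, num_labels)
```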
AILAB-Udine@SMM4H 22: Limits of Transformers and BERT Ensembles
This paper describes the models developed by the AILAB-Udine team for the
SMM4H 22 Shared Task. We explored the limits of Transformer-based models on
text classification, entity extraction and entity normalization, tackling Tasks
1, 2, 5, 6 and 10. The main take-aways we got from participating in different
tasks are: the overwhelming positive effects of combining different
architectures when using ensemble learning, and the great potential of
generative models for term normalization.
Comment: Shared Task, SMM4H, Transformer
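The ensembling take-away can be illustrated with a toy majority-vote sketch; the hard-coded predictions below stand in for the outputs of separately trained Transformer classifiers.

```python
# Toy sketch of ensemble learning by majority vote over per-model predictions.
from collections import Counter

def majority_vote(predictions_per_model):
    """predictions_per_model: list of prediction lists, one per model."""
    ensembled = []
    for votes in zip(*predictions_per_model):
        ensembled.append(Counter(votes).most_common(1)[0][0])
    return ensembled

model_a = ["ADE", "noADE", "ADE"]
model_b = ["ADE", "ADE", "noADE"]
model_c = ["noADE", "ADE", "ADE"]
print(majority_vote([model_a, model_b, model_c]))  # -> ['ADE', 'ADE', 'ADE']
```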
Representing Verbs with Rich Contexts: an Evaluation on Verb Similarity
Several studies on sentence processing suggest that the mental lexicon keeps
track of the mutual expectations between words. Current DSMs, however,
represent context words as separate features, thereby losing important
information for word expectations, such as word interrelations. In this paper,
we present a DSM that addresses this issue by defining verb contexts as joint
syntactic dependencies. We test our representation in a verb similarity task
on two datasets, showing that joint contexts achieve performance comparable
to, or even better than, single dependencies. Moreover, they are able to
overcome the data sparsity problem of joint feature spaces, in spite of the
limited size of our training corpus.
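A hedged sketch of the joint-context idea: rather than counting each dependency of a verb as a separate feature, the whole tuple of its syntactic arguments is counted as one feature, and verbs are compared by cosine over these joint-context counts. The toy parsed triples below are invented for illustration.

```python
# Joint syntactic contexts for verbs: each (subject, object) tuple is a single
# feature of the verb's count vector; verbs are compared by cosine similarity.
from collections import Counter
import math

# Toy dependency-parsed occurrences: (verb, subject, object).
occurrences = [
    ("eat", "dog", "bone"), ("eat", "child", "apple"),
    ("devour", "dog", "bone"), ("devour", "lion", "zebra"),
]

def joint_context_vector(verb):
    return Counter((subj, obj) for v, subj, obj in occurrences if v == verb)

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[f] * b[f] for f in a)
    norm = (math.sqrt(sum(x * x for x in a.values()))
            * math.sqrt(sum(x * x for x in b.values())))
    return dot / norm if norm else 0.0

print(cosine(joint_context_vector("eat"), joint_context_vector("devour")))
```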